4 research outputs found

    Confidential Consortium Framework: Secure Multiparty Applications with Confidentiality, Integrity, and High Availability

    Confidentiality, integrity protection, and high availability, abbreviated to CIA, are essential properties for trustworthy data systems. The rise of cloud computing and the growing demand for multiparty applications, however, mean that building modern CIA systems is more challenging than ever. In response, we present the Confidential Consortium Framework (CCF), a general-purpose foundation for developing secure, stateful CIA applications. CCF combines centralized compute with decentralized trust, supporting deployment on untrusted cloud infrastructure and transparent governance by mutually untrusted parties. CCF leverages hardware-based trusted execution environments for remotely verifiable confidentiality and code integrity. This is coupled with state machine replication backed by an auditable immutable ledger for data integrity and high availability. CCF enables each service to bring its own application logic, custom multiparty governance model, and deployment scenario, decoupling the operators of nodes from the consortium that governs them. CCF is open-source and available now at https://github.com/microsoft/CCF
    Comment: 16 pages, 9 figures. To appear in the Proceedings of the VLDB Endowment, Volume 1
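    The auditable immutable ledger behind CCF's data-integrity guarantee can be illustrated with a hash chain, in which each entry commits to the digest of its predecessor. The sketch below is a minimal illustration of that idea only; it is not CCF's actual API, serialization format, or replication protocol:

```python
import hashlib
import json

class Ledger:
    """Append-only ledger where each entry commits to its predecessor
    via a SHA-256 hash chain, so retroactive modification is detectable."""

    GENESIS = "0" * 64

    def __init__(self):
        self.entries = []        # list of {"prev": digest, "payload": dict}
        self._head = self.GENESIS

    def append(self, payload: dict) -> str:
        record = {"prev": self._head, "payload": payload}
        digest = hashlib.sha256(
            json.dumps(record, sort_keys=True).encode()
        ).hexdigest()
        self.entries.append(record)
        self._head = digest
        return digest

    def verify(self) -> bool:
        """Replay the chain; any tampered entry breaks every later digest."""
        head = self.GENESIS
        for record in self.entries:
            if record["prev"] != head:
                return False
            head = hashlib.sha256(
                json.dumps(record, sort_keys=True).encode()
            ).hexdigest()
        return head == self._head
```

    Because each digest covers the previous one, an auditor who trusts only the latest head can detect modification of any earlier entry.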

    Documentation of clinically relevant genomic biomarker allele frequencies in the next-generation FINDbase worldwide database

    FINDbase (http://www.findbase.org) is a comprehensive data resource recording the prevalence of clinically relevant genomic variants in various populations worldwide, such as pathogenic variants underlying genetic disorders as well as pharmacogenomic biomarkers that can guide drug treatment. Here, we report significant new developments and technological advancements in the database architecture, leading to a completely revamped database structure and querying interface, accompanied by substantial extensions of data content and curation. In particular, the FINDbase upgrade improves the user experience by introducing responsive features that support a wide variety of mobile and stationary devices, while improving computational runtime through the use of a modern JavaScript framework, ReactJS. Data collection is significantly enriched, with the data records divided into a Public and a Private version, the latter accessed on the basis of data contribution according to the microattribution approach, while the front end was redesigned to support the new functionalities and querying tools. These updates further enhance the impact of FINDbase, improve the overall user experience, facilitate further data sharing by microattribution, and strengthen the role of FINDbase as a key resource for personalized medicine and personalized public health applications.

    Space Efficient Data Structures for N-gram Retrieval

    A significant problem in computer science is the management of large data strings, and a great number of works dealing with this problem have been published in the scientific literature. In this article, we use a technique to store biological sequences efficiently, making use of data structures such as suffix trees and inverted files, and employing techniques such as n-grams, in order to improve on previous constructions. We drastically reduce the space needed to store the inverted indexes by representing the substrings that appear most frequently in a more compact inverted index. Our technique is based on n-gram indexing, which provides the added advantage of indexing sequences that cannot be separated into words. Moreover, our technique combines classical one-level with two-level n-gram inverted file indexing. Our results suggest that the proposed algorithm compresses the data more efficiently than previous attempts.
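    The core idea of n-gram inverted indexing on unsegmented sequences can be sketched as follows. This is a minimal single-level index; the paper's two-level combination and the compact representation of frequent substrings are not reproduced here:

```python
from collections import defaultdict

def build_ngram_index(sequence: str, n: int = 3) -> dict:
    """Map every overlapping n-gram to the positions at which it occurs;
    works on unsegmented text such as DNA, which has no word boundaries."""
    index = defaultdict(list)
    for i in range(len(sequence) - n + 1):
        index[sequence[i:i + n]].append(i)
    return dict(index)

def find(index: dict, pattern: str, n: int = 3) -> list:
    """Locate a pattern (length >= n) by intersecting the posting lists
    of its constituent n-grams, each shifted back to a common start."""
    candidates = None
    for offset in range(len(pattern) - n + 1):
        gram = pattern[offset:offset + n]
        starts = {pos - offset for pos in index.get(gram, [])}
        candidates = starts if candidates is None else candidates & starts
    return sorted(candidates or [])
```

    For example, searching `"GTAC"` in `"ACGTACGT"` intersects the posting lists of `"GTA"` and `"TAC"`, yielding the single match at position 2.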

    Predicting Secondary Structure for Human Proteins Based on Chou-Fasman Method

    Part 2: 8th Mining Humanistic Data Workshop. International audience. Proteins are constructed from combinations of different amino acids and thus have different structures and folding, depending on chemical reactions and other factors. Protein folding prediction can help in many healthcare scenarios to foresee and prevent diseases. The different elements that form a protein give its secondary structure. One of the most common algorithms used for secondary structure prediction is the Chou-Fasman method. This technique analyzes each amino acid's propensity to form three different structural elements, α-helices, β-sheets, and turns, based on already known protein structures. Its aim is to predict the probability with which each of these elements will be formed. In this paper, we use the Chou-Fasman algorithm to extract these probabilities for a series of amino acids given in FASTA format. We perform the analysis for a human protein of any length, without the restrictions of other existing tools.
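    The propensity lookup at the heart of the Chou-Fasman method can be sketched as follows. The table below is an illustrative subset of the published conformational parameters; a real predictor needs values for all 20 amino acids plus the method's full nucleation and extension rules, which are omitted here:

```python
# Illustrative subset of Chou-Fasman conformational propensities as
# (P_helix, P_sheet, P_turn) per residue; values above ~1.0 favor
# that element. A full implementation covers all 20 amino acids.
PROPENSITY = {
    "A": (1.42, 0.83, 0.66),  # Ala: strong helix former
    "E": (1.51, 0.37, 0.74),  # Glu: strong helix former
    "G": (0.57, 0.75, 1.56),  # Gly: turn-favoring
    "P": (0.57, 0.55, 1.52),  # Pro: helix breaker, turn-favoring
    "V": (1.06, 1.70, 0.50),  # Val: strong sheet former
    "I": (1.08, 1.60, 0.47),  # Ile: strong sheet former
}

def window_scores(seq: str, width: int = 6):
    """Average helix/sheet/turn propensity over each sliding window;
    Chou-Fasman nucleates an element where the average exceeds ~1.0."""
    for i in range(len(seq) - width + 1):
        window = seq[i:i + width]
        cols = zip(*(PROPENSITY[aa] for aa in window))
        yield window, tuple(sum(col) / width for col in cols)
```

    On a helix-rich stretch such as `"AEAEAE"`, the window's average helix propensity exceeds 1.0 while its sheet propensity stays well below it, which is the signal the method uses to nucleate an α-helix.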